Skip to content

fix: atomic last-owner guard prevents TOCTOU race on role demotion#1590

Merged
rohilsurana merged 10 commits into
mainfrom
fix/toctou-last-owner-guard
May 13, 2026
Merged

fix: atomic last-owner guard prevents TOCTOU race on role demotion#1590
rohilsurana merged 10 commits into
mainfrom
fix/toctou-last-owner-guard

Conversation

@rohilsurana
Copy link
Copy Markdown
Member

@rohilsurana rohilsurana commented Apr 30, 2026

Summary

  • Two concurrent SetOrganizationMemberRole requests to demote the last two owners can both pass the application-level validateMinOwnerConstraint check (each sees 2 owners) and both proceed, leaving the org with zero owners
  • Added DeleteWithMinRoleGuard to the policy repository that uses SELECT FOR UPDATE in a CTE to serialize concurrent deletions, then conditionally deletes only if at least one other owner remains
  • Resource ID and type are derived from the existing policy inside the repository, not from caller input
  • Owner role is fetched once and passed through validateMinOwnerConstraintreplacePolicyWithOwnerGuard, eliminating duplicate role lookups
  • DB delete runs before SpiceDB relation removal to prevent inconsistent state if the guard rejects
  • Existing validateMinOwnerConstraint remains as a fast-path optimization

Test plan

  • Existing membership and policy unit tests pass (updated mocks)
  • Verified concurrent demotions via pg_sleep forced race: without FOR UPDATE both delete (0 owners), with FOR UPDATE second transaction blocks and is rejected (1 owner)

@vercel
Copy link
Copy Markdown

vercel Bot commented Apr 30, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
frontier Ready Ready Preview, Comment May 13, 2026 11:08am

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 30, 2026

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d840aeec-d194-4b41-8630-7a7759c289d4

📥 Commits

Reviewing files that changed from the base of the PR and between ce29b15 and d4c4c52.

📒 Files selected for processing (1)
  • core/membership/service.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • core/membership/service.go

📝 Walkthrough

Summary by CodeRabbit

  • New Features

    • Guarded deletion prevents removing or demoting the last organization owner; such attempts are blocked with a clear error.
    • Cascaded member removals now delete organization-level (owner) policies first and fully clean up related relations and audit records.
  • Bug Fixes

    • Improved handling and error reporting for concurrent demotion/delete races so last-owner conflicts are surfaced consistently.

Walkthrough

Adds guarded policy-deletion (DeleteWithMinRoleGuard) across repository, service, and mocks; a Postgres transactional guarded DELETE; new sentinel error ErrLastRoleGuard; and membership-service changes/tests to use and translate the DB guard into ErrLastOwnerRole during owner demotion/removal cascades.

Changes

Guarded Policy Deletion for Owner Role Protection

Layer / File(s) Summary
Error Definitions
core/policy/errors.go
Adds exported sentinel ErrLastRoleGuard.
Policy Interface & Mocks
core/policy/policy.go, core/policy/mocks/repository.go, core/membership/mocks/policy_service.go
Adds DeleteWithMinRoleGuard(ctx context.Context, id string, guardRoleID string) error to Repository and generated mock implementations/expecters for repository and membership policy service.
Policy Service Layer
core/policy/service.go
Adds Service.DeleteWithMinRoleGuard which calls repository guarded-delete then deletes the corresponding role-binding relation via relationService.Delete.
Policy Repository (DB)
internal/store/postgres/policy_repository.go
Adds DeleteWithMinRoleGuard: loads target policy, runs a transaction with a locked CTE + guarded DELETE ensuring another policy with the guarded role remains, returns policy.ErrLastRoleGuard when blocked, maps missing → policy.ErrNotExist, and writes a PolicyDeleted audit record on success.
Membership Service Integration
core/membership/service.go
Adds PolicyService.DeleteWithMinRoleGuard to the interface; changes SetOrganizationMemberRole to resolve ownerRoleID and pass it to replace/delete flows; introduces cascadeRemovePrincipal and deletePolicy helper to route owner-policy deletions through the guarded API and map policy.ErrLastRoleGuardmembership.ErrLastOwnerRole.
Membership Tests
core/membership/service_test.go
Adds test asserting DB guard triggers membership.ErrLastOwnerRole for concurrent demotion; updates tests to expect DeleteWithMinRoleGuard for owner-policy deletions and refines cascade-deletion failure assertions and ordering.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • raystack/frontier#1537: Extends the core/membership/service.go PolicyService surface that this PR further adds DeleteWithMinRoleGuard(...) to and updates membership flows/tests.
  • raystack/frontier#1531: Related changes to SetOrganizationMemberRole early-return/delete+create behavior that intersect with this PR’s replace/delete flow adjustments.
  • raystack/frontier#1541: Modifies org member role flow and min-owner constraint logic that this PR also changes and extends with guarded deletes.

Suggested reviewers

  • AmanGIT07
  • whoAbhishekSah
  • rsbh
🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR aims to prevent a TOCTOU race in organization owner demotions by moving the “must keep at least 1 owner” check into an (intended) atomic delete path in the policy repository, while keeping the existing application-level validation as a fast-path.

Changes:

  • Added DeleteWithMinRoleGuard to the policy repository/service layer and wired it into SetOrganizationMemberRole.
  • Introduced a new policy-layer error (ErrLastRoleGuard) surfaced when a guarded delete would violate the minimum-role constraint.
  • Updated membership unit tests/mocks to use the new guarded delete method.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
internal/store/postgres/policy_repository.go Adds guarded delete SQL intended to atomically prevent deleting the last policy of a role for a resource.
core/policy/service.go Exposes DeleteWithMinRoleGuard at service level and calls into repository.
core/policy/policy.go Extends repository interface with DeleteWithMinRoleGuard.
core/policy/errors.go Adds ErrLastRoleGuard.
core/policy/mocks/repository.go Updates mocks for the new repository method.
core/membership/service.go Switches org role replacement to guarded policy deletion for owner demotions.
core/membership/mocks/policy_service.go Updates mocks for the new policy service method.
core/membership/service_test.go Updates expectations to use guarded deletion and accounts for extra owner-role lookups.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/store/postgres/policy_repository.go Outdated
Comment thread internal/store/postgres/policy_repository.go Outdated
Comment thread internal/store/postgres/policy_repository.go Outdated
Comment thread internal/store/postgres/policy_repository.go Outdated
Comment thread core/policy/service.go Outdated
Comment thread core/membership/service.go Outdated
@coveralls
Copy link
Copy Markdown

coveralls commented Apr 30, 2026

Coverage Report for CI Build 25795292940

Warning

Build has drifted: This PR's base is out of sync with its target branch, so coverage data may include unrelated changes.
Quick fix: rebase this PR. Learn more →

Coverage decreased (-0.04%) to 42.249%

Details

  • Coverage decreased (-0.04%) from the base build.
  • Patch coverage: 80 uncovered changes across 3 files (46 of 126 lines covered, 36.51%).
  • 48 coverage regressions across 1 file.

Uncovered Changes

File Changed Covered %
internal/store/postgres/policy_repository.go 65 0 0.0%
core/policy/service.go 10 0 0.0%
core/membership/service.go 51 46 90.2%

Coverage Regressions

48 previously-covered lines in 1 file lost coverage.

File Lines Losing Coverage Coverage
core/membership/service.go 48 81.01%

Coverage Stats

Coverage Status
Relevant Lines: 37646
Covered Lines: 15905
Line Coverage: 42.25%
Coverage Strength: 11.88 hits per line

💛 - Coveralls

Comment thread core/policy/service.go Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread core/policy/service.go
Comment thread internal/store/postgres/policy_repository.go
Comment thread core/membership/service.go Outdated
Comment thread core/membership/service.go
Comment thread core/membership/service_test.go
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread core/membership/service.go Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/store/postgres/policy_repository.go
Comment thread core/membership/service.go Outdated
Comment thread core/membership/service.go Outdated
Comment thread core/policy/service.go
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
core/membership/service_test.go (1)

775-789: ⚡ Quick win

Add a happy-path RemoveOrganizationMember test that hits the guarded delete branch.

The new removal assertions cover the reject path and plain-delete failures, but there is still no case that expects DeleteWithMinRoleGuard during a successful org-owner removal. That branch is the new phase-ordering behavior and is currently easy to regress unnoticed.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: bf6c45a2-8320-48a5-94d6-a6ad4db445ee

📥 Commits

Reviewing files that changed from the base of the PR and between cb27b57 and 20f5d24.

📒 Files selected for processing (8)
  • core/membership/mocks/policy_service.go
  • core/membership/service.go
  • core/membership/service_test.go
  • core/policy/errors.go
  • core/policy/mocks/repository.go
  • core/policy/policy.go
  • core/policy/service.go
  • internal/store/postgres/policy_repository.go

Comment thread core/policy/service.go
Comment thread internal/store/postgres/policy_repository.go
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b47f33b1-63ca-4643-aaed-63cdd8eb4b01

📥 Commits

Reviewing files that changed from the base of the PR and between 20f5d24 and 139c56f.

📒 Files selected for processing (1)
  • internal/store/postgres/policy_repository.go

Comment thread internal/store/postgres/policy_repository.go
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Comment thread internal/store/postgres/policy_repository.go
Comment thread core/policy/service.go
Comment thread core/membership/service.go Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
internal/store/postgres/policy_repository.go (1)

366-442: ⚡ Quick win

Add a Postgres integration test for the FOR UPDATE serialization.

The Coveralls report shows 0% patch coverage on this file (0/64 changed lines). For a guard whose entire correctness argument rests on row-locking under concurrent demotions, the existing pg_sleep-based manual test described in the PR description should be promoted into the repository's docker-postgres integration suite — otherwise a future refactor (e.g., dropping FOR UPDATE, reordering the CTE, changing the predicate) can silently regress the TOCTOU fix without any signal in CI.

Suggested coverage:

  • Two concurrent DeleteWithMinRoleGuard calls against the only two guarded-role rows → exactly one returns ErrLastRoleGuard, the other succeeds, and the resource ends with exactly one guarded-role policy.
  • Concurrent delete of the same id from two transactions → one succeeds, the other returns ErrNotExist (verifies the existence re-check path).
  • Demotion when target's role_id != guardRoleID → unconditional delete succeeds even when count of guarded-role rows is 0.
  • Guard rejection when target is the sole guarded-role holder → returns ErrLastRoleGuard and row is preserved.

Want me to draft the integration test scaffolding using the existing repository test harness?


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 013354a8-2a05-40ef-b061-7cc3d6740636

📥 Commits

Reviewing files that changed from the base of the PR and between 139c56f and 7ee4792.

📒 Files selected for processing (1)
  • internal/store/postgres/policy_repository.go

@AmanGIT07
Copy link
Copy Markdown
Contributor

AmanGIT07 commented May 13, 2026

@rohilsurana PR looks good, a few points worth looking:

1. RemoveOrganizationMember tests don't exercise the new guard path.
The second policySvc.List mocks in the cascade tests return policies without RoleID, so the Phase 2 check pol.RoleID == ownerRoleID is always false and DeleteWithMinRoleGuard is silently skipped. Set RoleID: ownerRoleID on the org policy in those mocks and switch the expectation to DeleteWithMinRoleGuard. Add a test mirroring the new SetOrganizationMemberRole case where the repo returns ErrLastRoleGuard and we expect ErrLastOwnerRole.

2. No integration test for the new SQL.
policy_repository_test.go already uses dockertest; please add TestDeleteWithMinRoleGuard covering: happy path, last-owner rejection, concurrent demotion (two goroutines + pg_sleep), non-guard-role policy (the defensive branch), and concurrent deletion between Get and the txn.

3. Retry-semantics regression in RemoveOrganizationMember.
The pre-PR ordering deleted org policies last so a partial failure in sub-resource cleanup or relations was retryable — the org policy still existed, so membership check passed. The new ordering deletes the org owner policy first, so any failure in Phase 3/4 leaves orphaned policies/relations and a retry returns ErrNotMember. Either document this in the godoc and provide operator cleanup tooling, or accept it as a known trade-off.

4. No log on partial failure in policy.Service.DeleteWithMinRoleGuard. If DB commits but relationService.Delete fails, SpiceDB has stale bearer/role tuples for a now-deleted policy. Mirror the structured log used in replacePolicyWithOwnerGuard.

Suggested refactor

RemoveOrganizationMember reads as four numbered phases inline; the natural shape is validate → guard → cascade. Two extractions cut ~70 lines and remove the Phase N comments.

1. Extract the cascade. RemoveOrganizationMember keeps only validation + the guard loop + audit; everything else moves into cascadeRemovePrincipal(ctx, org, principalID, principalType, ownerRoleID) which classifies, deletes non-owner org + sub-resource policies, and removes SpiceDB relations.

2. Merge replacePolicyWithOwnerGuard and replacePolicy into one method + a deletePolicy helper:

func (s *Service) replacePolicy(ctx context.Context, ..., existing []policy.Policy, ownerRoleID string) error {                                                                        
    for _, p := range existing {                                                                                                                                                         
        if err := s.deletePolicy(ctx, p, ownerRoleID); err != nil {
            if errors.Is(err, policy.ErrLastRoleGuard) { return ErrLastOwnerRole }                                                                                                       
            return fmt.Errorf("delete policy %s: %w", p.ID, err)                                                                                                                         
        }
    }                                                                                                                                                                                    
    // ... createPolicy                                                                                                                                                                  
}
                                                                                                                                                                                         
func (s *Service) deletePolicy(ctx context.Context, pol policy.Policy, ownerRoleID string) error {                                                                                     
    if ownerRoleID != "" && pol.RoleID == ownerRoleID {
        return s.policyService.DeleteWithMinRoleGuard(ctx, pol.ID, ownerRoleID)                                                                                                          
    }                                                                                                                                                                                    
    return s.policyService.Delete(ctx, pol.ID)                                                                                                                                           
}                                                                                                                                                                                        
                                                                                                                                                                                                                                                    

Add DeleteWithMinRoleGuard to the policy repository that atomically
checks the owner count within the DELETE statement itself. This
eliminates the race where two concurrent demotion requests both pass
the application-level owner count check then both proceed.

The SQL condition ensures the DELETE only executes if the policy
being removed is either not the guarded role, or at least one other
policy with that role exists for the same resource. If the condition
fails (last owner), zero rows are affected and ErrLastRoleGuard is
returned.

The existing validateMinOwnerConstraint remains as a fast-path
optimization to avoid unnecessary DB round-trips.
1. Flip SpiceDB/DB ordering — DB delete first, then SpiceDB relation
   removal. Prevents inconsistent state if the guard rejects the delete.

2. Use SELECT FOR UPDATE in CTE to serialize concurrent deletions under
   READ COMMITTED isolation. Two concurrent DELETEs of different owner
   rows now block on the row lock instead of both proceeding.

3. Use TABLE_POLICIES constant instead of hardcoded "policies" string.

4. Derive resource_id/resource_type from existingPolicy inside the
   repository instead of trusting caller-supplied values.

5. Fetch owner role once — validateMinOwnerConstraint now returns the
   resolved ownerRoleID for reuse, eliminating the duplicate Get call.
- Only use guarded delete for owner policies, plain Delete for non-owner
  (avoids unnecessary locking/serialization)
- Apply guarded delete in RemoveOrganizationMember flow too (same TOCTOU)
- Add unit test for ErrLastRoleGuard → ErrLastOwnerRole mapping
- Fix test expectations for non-owner policy deletions
…OrganizationMember

If the guard rejects (last owner), we now return early before touching
any SpiceDB relations, preventing partial state modification on rejection.
Restructured RemoveOrganizationMember into 3 phases:
1. Classify policies by type (no deletions)
2. Guarded org owner delete (rejects early if last owner)
3. Remaining org policies, sub-resource policies, then relations

This ensures guard rejection leaves no partial state.
When rowsAffected==0, re-check existence inside the transaction. If the
row is gone (concurrent delete), return sql.ErrNoRows which maps to
ErrNotExist. Only return ErrLastRoleGuard when the row still exists but
the guard condition prevented deletion.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 68febd0f-7bec-43a2-8be7-358532e2e0c3

📥 Commits

Reviewing files that changed from the base of the PR and between 7ee4792 and ce29b15.

📒 Files selected for processing (8)
  • core/membership/mocks/policy_service.go
  • core/membership/service.go
  • core/membership/service_test.go
  • core/policy/errors.go
  • core/policy/mocks/repository.go
  • core/policy/policy.go
  • core/policy/service.go
  • internal/store/postgres/policy_repository.go
✅ Files skipped from review due to trivial changes (2)
  • core/membership/mocks/policy_service.go
  • core/policy/mocks/repository.go
🚧 Files skipped from review as they are similar to previous changes (5)
  • core/policy/errors.go
  • core/policy/policy.go
  • core/policy/service.go
  • internal/store/postgres/policy_repository.go
  • core/membership/service_test.go

Comment thread core/membership/service.go
Comment thread core/membership/service.go
@rohilsurana rohilsurana merged commit 0d0ca1d into main May 13, 2026
8 checks passed
@rohilsurana rohilsurana deleted the fix/toctou-last-owner-guard branch May 13, 2026 11:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants